Text mapping: Visualising unstructured, structured, and time-based text collections
Abstract
Large collections of text documents are increasingly common, both in business and personal information environments. Tools from the field of information visualisation are being used to help users make sense of and extract useful knowledge from such collections. Flat text collections are often visualised using distance calculations between documents and subsequent (distance-preserving) projection. Distance calculations are often based on a vector space of term vectors. Projection is often achieved with a force-directed placement algorithm. Where extra information about a text collection is available, such as a topical hierarchy or some chronological ordering, it can be used to improve a visualisation. This paper gives an overview of text mapping techniques.
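The pipeline the abstract outlines (term vectors, pairwise distances, force-directed placement) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the raw term-frequency weighting, the cosine distance, the plain spring model, and all names and parameters (`term_vector`, `cosine_distance`, `distance_matrix`, `force_directed_layout`, `step`, `iterations`) are assumptions chosen for brevity.

```python
import math
import random
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector (assumed weighting: raw counts)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def distance_matrix(docs):
    """Pairwise cosine distances between all documents."""
    vectors = [term_vector(d) for d in docs]
    return [[cosine_distance(u, v) for v in vectors] for u in vectors]


def force_directed_layout(dist, iterations=500, step=0.05, seed=0):
    """Place items in 2-D so that screen distances approximate `dist`.

    Every pair is joined by a spring whose rest length is the target
    distance; each pass nudges every point along the net spring force.
    """
    rng = random.Random(seed)
    n = len(dist)
    pos = [[rng.random(), rng.random()] for _ in range(n)]
    for _ in range(iterations):
        for i in range(n):
            fx = fy = 0.0
            for j in range(n):
                if i == j:
                    continue
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                d = math.hypot(dx, dy) or 1e-9  # avoid division by zero
                # Positive when the pair is too far apart (pull together),
                # negative when too close (push apart).
                f = (d - dist[i][j]) / d
                fx += f * dx
                fy += f * dy
            pos[i][0] += step * fx
            pos[i][1] += step * fy
    return pos


if __name__ == "__main__":
    docs = [
        "information visualisation of text collections",
        "visualisation of large text collections",
        "hospital admission records and shipping records",
    ]
    for doc, (x, y) in zip(docs, force_directed_layout(distance_matrix(docs))):
        print(f"({x:.2f}, {y:.2f})  {doc}")
```

In this sketch, documents that share many terms end up close together on the plane, while documents with no terms in common settle roughly one unit apart. The all-pairs loop is quadratic in the number of documents, so it is only practical for small collections.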
Similar resources
A Reference-set Approach to Information Extraction from Unstructured, Ungrammatical Data Sources
This thesis investigates information extraction from unstructured, ungrammatical text on the Web such as classified ads, auction listings, and forum postings. Since the data is unstructured and ungrammatical, this information extraction precludes the use of rule-based methods that rely on consistent structures within the text or natural language processing techniques that rely on grammar. Inste...
Mining Association Rules from Unstructured Documents
This paper presents EART (Extract Association Rules from Text), a system for discovering association rules from collections of unstructured documents. EART handles text only, not images or figures. It discovers association rules among the keywords that label the collection of textual documents. The main characteristic of EART is that the system integrates XML technology (to transf...
Finding Novel Information in Large, Constantly Incrementing Collections of Structured Data
Project Argus addresses the problem of obtaining novel intelligence from large, constantly incrementing collections of structured data like shipping records, financial transfers, or hospital admission records. Structured data already provides intelligence analysts with a huge amount of important information. The ever-increasing capabilities of techniques to discern structure in currently unstru...
Big Scale Text Analytics and Smart Content Navigation
Identifying and exploring relevant content in growing document collections is a challenge for researchers, users, and system providers alike. Supporting this is crucial for companies offering knowledge in the form of documents as their core product. Our demo shows an intelligent way of doing guided research in big text collections, using the collection of the major scientific publisher Springer...
Keyword-Based Browsing and Analysis of Large Document Sets
Knowledge Discovery in Databases (KDD) focuses on the computerized exploration of large amounts of data and on the discovery of interesting patterns within them. While most work on KDD has been concerned with structured databases, there has been little work on handling the huge amount of information that is available only in unstructured textual form. This paper describes the KDT system for Kno...
Journal title:
- Intelligent Decision Technologies
Volume 2
Publication date: 2008